This analysis applies the Bioconductor package fastseg to segment chromosomes based on a numeric variable, such as DNA copy number or the fold change of RNA transcription. Please refer to the package manual for a full description of the package. In summary, fastseg implements a fast and efficient segmentation algorithm based on the cyber t-test (Baldi and Long, 2001). Segments identified by the algorithm are then summarized and compared to segments derived from randomized data in terms of their frequency, length, size, and mean of the numeric variable (copy number, fold change, etc.).

 


1 Description

1.1 Project

Genetic Modifiers in Trisomy 21 Leukemogenesis.

1.2 Data

GEO public data set. RPKM and log2FC values were downloaded from GSE55504.

1.3 Analysis

Log2 fold change between one pair of monozygotic twins (T2N_Rep0 vs. T1DS_Rep0). The goal is to reproduce Figure 1a of the original paper.
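For readers unfamiliar with the input variable, the sketch below shows one common way to derive a log2 fold change from two RPKM vectors. The log2FC values in this report were downloaded from GEO, not computed this way. The helper name, the pseudocount of 1, the choice of numerator, and the toy values are all assumptions made for illustration.

```r
# Hypothetical helper: log2 fold change between two RPKM vectors.
# The pseudocount guards against log2(0); its value (1) is an assumption.
log2_fold_change <- function(rpkm_num, rpkm_den, pseudocount = 1) {
  stopifnot(length(rpkm_num) == length(rpkm_den))
  log2((rpkm_num + pseudocount) / (rpkm_den + pseudocount))
}

# Toy RPKM values for three genes (invented, not taken from GSE55504).
# Which twin serves as the numerator is an assumption of this sketch.
t1ds <- c(3, 7, 0)
t2n  <- c(1, 7, 1)

fc <- log2_fold_change(t1ds, t2n)
print(fc)
```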

2 Results

2.1 Summary

Parameters:

  • Variable name: logFC_Rep0

  • Data columns
    • variable: 9
    • chromosome: 1
    • start: 2
    • end: 3
    • strand: 0
  • Randomization
    • round: 10
    • chromosome: 1
  • Runtime options
    • minSeg: 3
    • type: 1
    • alpha: 0.1
    • delta: 5
    • squashing: 0
    • cyberWeight: 10
  • Segment selection
    • size: 1
    • min: 0
    • max: Inf
    • negpos: 0
    • top: 3
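The runtime options above have the same names as arguments of fastseg(), so a call with these settings might look like the sketch below. It is a minimal illustration on simulated data, not the code used by the report template. The toy vector is invented, and the call is skipped if fastseg is not installed.

```r
# Toy data: 60 loci with a shifted block in the middle (invented for illustration).
set.seed(1)
values <- c(rnorm(20, 0, 0.3), rnorm(20, 1, 0.3), rnorm(20, 0, 0.3))

# Runtime options copied from the parameter list above.
seg_args <- list(type = 1, alpha = 0.1, minSeg = 3, delta = 5,
                 squashing = 0, cyberWeight = 10)

if (requireNamespace("fastseg", quietly = TRUE)) {
  # fastseg() accepts a numeric vector and returns a GRanges object with
  # one row per segment (metadata columns include num.mark and seg.mean).
  segments <- do.call(fastseg::fastseg, c(list(values), seg_args))
  print(segments)
} else {
  message("fastseg is not installed; skipping the segmentation call")
}
```

The segment selection options (size, min, max, negpos, top) belong to the report template, not to fastseg, so they are not passed here.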

Table 1. Brief summary of inputs and outputs.

Description Value
Total number of loci 13130
Total number of chromosomes 23
Range of values -1.08768 to 1.544644 (mean=-0.02018045)
Number of segments 88
Length of segments 6 to 736 (mean=149.2045)
Size of segments 6 to 736 (mean=149.2045)
Mean of segments -0.7273123 to 1.3129 (mean=0.009995079)

2.2 Segmentation

Figure 1. Global view of segmentation across all chromosomes (shown in alternating colors). Red lines indicate segment locations. Figures for individual chromosomes are available for download.

Figure 2. Distribution of logFC_Rep0: original values at all individual loci vs. segment means.

2.3 Segment selection

Significant segments are selected using the segment selection criteria given in the parameters above.

Table 2. Summary of selected segments: location, length, size, and logFC_Rep0 at individual loci. Follow the links in the last two columns for the full list of loci within each segment and a Manhattan plot of the segmentation.

segment chromosome start end length size mean minimum maximum variance loci segmentation
segment_1 chr1 11869 67519782 67507914 669 -0.2356 -5.0429 2.1058 0.7586 table figure
segment_2 chr1 67873493 110252173 42378681 133 0.3954 -1.9398 2.6433 0.9357 table figure
segment_3 chr1 110230450 236647768 126417319 693 -0.0831 -6.6591 3.2144 0.9244 table figure
segment_4 chr1 236681300 249231242 12549943 41 0.3398 -2.0244 1.6013 0.8239 table figure
segment_5 chr2 217730 178417742 178200013 637 0.0195 -9.9074 3.1511 1.0335 table figure
segment_6 chr2 178467817 209130798 30662982 109 0.3909 -2.2458 1.9035 0.9505 table figure
segment_7 chr2 209130991 242708231 33577241 157 -0.2093 -3.7878 1.8398 0.8243 table figure
segment_8 chr3 3168600 49298744 46130145 239 -0.0373 -3.4099 2.2381 0.8272 table figure
segment_9 chr3 49306035 52931612 3625578 82 -0.3772 -1.8433 1.5280 0.6071 table figure
segment_10 chr3 53122499 197766105 144643607 427 0.2588 -2.6321 2.7871 0.8400 table figure
segment_11 chr4 53179 15739936 15686758 80 -0.2950 -4.6756 1.3071 0.8102 table figure
segment_12 chr4 15732585 190884359 175151775 281 0.2804 -5.2688 2.2872 0.9305 table figure
segment_13 chr5 140373 43313614 43173242 84 -0.0149 -4.2109 1.7708 0.9026 table figure
segment_14 chr5 43486803 68525956 25039154 42 0.6300 -1.0121 2.0873 0.7552 table figure
segment_15 chr5 68530668 79950802 11420135 45 0.3271 -1.9020 1.7760 0.8857 table figure
segment_16 chr5 80597409 122372436 41775028 68 0.5594 -3.5679 1.8324 0.9243 table figure
segment_17 chr5 122422943 180699168 58276226 284 -0.0864 -5.8822 2.6319 0.9784 table figure
segment_18 chr6 142272 27861669 27719398 132 0.0340 -4.3270 2.1064 1.0476 table figure
segment_19 chr6 28058932 31774761 3715830 92 -0.3850 -2.8898 1.1635 0.6662 table figure
segment_20 chr6 31777396 33285719 1508324 50 -0.1822 -1.6904 0.8911 0.4914 table figure
segment_21 chr6 33286335 36200567 2914233 39 -0.2889 -2.8836 1.1009 0.7150 table figure
segment_22 chr6 36410544 42110357 5699814 39 -0.2325 -1.9797 1.2070 0.6256 table figure
segment_23 chr6 42174539 49430904 7256366 53 -0.2190 -1.4217 1.1263 0.5879 table figure
segment_24 chr6 50880425 170893780 120013356 288 0.2259 -2.9970 2.4908 0.8390 table figure
segment_25 chr7 182935 6523821 6340887 66 -0.5892 -1.9985 0.9857 0.5858 table figure
segment_26 chr7 6485584 158622944 152137361 630 -0.0436 -4.8863 2.3520 0.8896 table figure
segment_27 chr8 163251 144780583 144617333 330 0.1205 -4.7609 3.3133 0.9468 table figure
segment_28 chr8 144766622 145669827 903206 35 -0.5995 -2.0064 0.4901 0.5857 table figure
segment_29 chr8 145660602 146281416 620815 21 -0.2837 -0.9778 0.9809 0.5057 table figure
segment_30 chr9 14511 100845357 100830847 270 -0.0467 -3.4990 2.9521 0.8789 table figure
segment_31 chr9 100831557 114697649 13866093 49 0.2709 -2.5319 2.1696 0.9515 table figure
segment_32 chr9 114680537 139932407 25251871 254 -0.3436 -3.5434 1.4583 0.6971 table figure
segment_33 chr9 139933922 140764468 830547 26 -0.5518 -1.4170 0.0027 0.3966 table figure
segment_34 chr10 180405 75879918 75699514 254 0.0745 -4.5955 2.0081 0.9168 table figure
segment_35 chr10 75855924 88994912 13138989 50 -0.3679 -2.9220 1.7058 0.8944 table figure
segment_36 chr10 88963610 135516024 46552415 242 0.0287 -4.9038 2.2052 0.8089 table figure
segment_37 chr11 127115 8023409 7896295 94 -0.2685 -1.9441 1.5486 0.5685 table figure
segment_38 chr11 8040791 35042138 27001348 81 0.3705 -2.0985 1.9089 0.8089 table figure
segment_39 chr11 35160417 77791265 42630849 341 -0.3055 -3.4892 2.6533 0.6016 table figure
segment_40 chr11 77811982 111901091 34089110 67 0.1643 -4.0761 1.9093 1.0813 table figure
segment_41 chr11 111895538 134135749 22240212 106 -0.1898 -3.3133 1.2688 0.7776 table figure
segment_42 chr12 73725 59314303 59240579 344 -0.0865 -2.9178 2.9253 0.7814 table figure
segment_43 chr12 60205883 108155049 47949167 118 0.5519 -3.2309 3.8832 0.8693 table figure
segment_44 chr12 108908962 133684130 24775169 170 -0.1258 -1.6405 2.1765 0.6655 table figure
segment_45 chr13 19271143 100803346 81532204 162 0.1912 -5.3103 2.1129 0.9980 table figure
segment_46 chr13 101183801 107220512 6036712 13 0.8684 -0.2380 2.1277 0.6221 table figure
segment_47 chr13 108859794 113863029 5003236 18 0.0994 -0.9881 1.6343 0.8313 table figure
segment_48 chr13 113862552 114542321 679770 7 0.0401 -0.4586 0.6164 0.3526 table figure
segment_49 chr13 114523524 115092796 569273 6 0.0191 -0.6241 0.9178 0.6902 table figure
segment_50 chr14 20724717 24809251 4084535 77 -0.0650 -1.2859 1.1092 0.5148 table figure
segment_51 chr14 24834879 61124977 36290099 92 0.5503 -2.3510 1.9080 0.8103 table figure
segment_52 chr14 61176246 104200005 43023760 192 -0.0197 -3.1312 2.0151 0.7476 table figure
segment_53 chr14 104378625 106445233 2066609 22 -0.6871 -1.9410 0.9272 0.6493 table figure
segment_54 chr15 20587869 44094787 23506919 118 -0.0894 -3.4076 2.7933 0.8240 table figure
segment_55 chr15 44085857 63434260 19348404 59 0.4016 -2.8855 1.6862 0.8999 table figure
segment_56 chr15 63418071 102516768 39098698 195 -0.1151 -2.0749 2.3433 0.7266 table figure
segment_57 chr16 64043 9058371 8994329 158 -0.4216 -2.7164 2.1695 0.5625 table figure
segment_58 chr16 9185505 23607677 14422173 75 -0.0231 -1.6434 1.4130 0.6469 table figure
segment_59 chr16 23614488 31124110 7509623 94 -0.3073 -2.0812 1.1823 0.5034 table figure
segment_60 chr16 31127075 56661024 25533950 41 0.0048 -1.3676 1.2314 0.5447 table figure
segment_61 chr16 56672578 90114181 33441604 232 -0.3607 -4.6861 1.5915 0.7586 table figure
segment_62 chr17 254326 43627701 43373376 523 -0.2931 -6.2040 1.8707 0.8156 table figure
segment_63 chr17 43685909 72869156 29183248 157 -0.1799 -7.0149 1.9524 1.2378 table figure
segment_64 chr17 72983727 81052864 8069138 141 -0.4125 -2.2784 1.3880 0.5363 table figure
segment_65 chr18 158383 9960018 9801636 25 0.2933 -0.7842 1.1832 0.5818 table figure
segment_66 chr18 10525902 12658133 2132232 8 -0.1958 -1.5229 1.3578 0.7995 table figure
segment_67 chr18 12658042 71959251 59301210 86 0.3406 -2.8142 1.8097 0.7774 table figure
segment_68 chr18 72057119 77905406 5848288 12 -0.2828 -0.8218 0.4579 0.4277 table figure
segment_69 chr19 197124 51222707 51025584 736 -0.4294 -6.4535 1.9152 0.7549 table figure
segment_70 chr19 51226586 54697585 3471000 46 0.0274 -4.0263 2.3708 0.9637 table figure
segment_71 chr19 54704610 57012035 2307426 41 -0.2374 -1.1677 0.9367 0.5544 table figure
segment_72 chr19 57050317 59111168 2060852 56 0.0269 -1.0557 0.9142 0.5086 table figure
segment_73 chr20 251504 5093749 4842246 62 -0.3828 -1.6459 1.0116 0.5421 table figure
segment_74 chr20 5095599 23335414 18239816 50 0.1748 -1.9154 2.0994 0.8692 table figure
segment_75 chr20 23342787 62731996 39389210 295 -0.2995 -4.6692 1.8472 0.7496 table figure
segment_76 chr21 11180920 27144771 15963852 12 0.7130 -1.3559 2.2494 1.0781 table figure
segment_77 chr21 27252861 30446118 3193258 13 1.1566 -1.6035 2.2754 1.1035 table figure
segment_78 chr21 30594642 48111157 17516516 117 0.3766 -3.0948 1.9731 0.8630 table figure
segment_79 chr22 16122720 18848562 2725843 18 -0.4249 -2.2664 1.0367 0.7856 table figure
segment_80 chr22 18893541 20380440 1486900 33 -0.7847 -6.2213 0.2963 1.1172 table figure
segment_81 chr22 20383524 31744670 11361147 134 -0.5006 -4.0952 0.8794 0.6977 table figure
segment_82 chr22 31795509 36817689 5022181 28 -0.2508 -1.6493 0.8554 0.6180 table figure
segment_83 chr22 36863083 51239737 14376655 194 -0.4652 -4.8443 1.2162 0.6848 table figure
segment_84 chrX 220013 100651105 100431093 268 -0.0664 -2.6021 1.9599 0.8579 table figure
segment_85 chrX 100652791 106243474 5590684 32 0.5902 -2.9172 2.3211 1.1507 table figure
segment_86 chrX 106374887 147134266 40759380 91 0.2378 -3.8162 2.7945 1.0509 table figure
segment_87 chrX 148558521 153237258 4678738 45 -0.4830 -1.8307 1.7309 0.7364 table figure
segment_88 chrX 153237778 154688276 1450499 34 -0.4345 -3.2665 1.0312 0.8347 table figure
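The columns of Table 2 can be derived from a table of loci once each locus has been assigned to a segment. The sketch below is one way to do this in base R. It assumes that length is the span from the first start to the last end (end - start + 1, which matches the rows above) and that the variance column is the sample variance. The data frame and function names are hypothetical, and the toy values are invented.

```r
# Toy loci table (invented); `segment` is the segment each locus was assigned to.
loci <- data.frame(
  segment = c("segment_1", "segment_1", "segment_1", "segment_2", "segment_2"),
  start   = c(100, 500, 900, 1500, 2000),
  end     = c(200, 650, 1000, 1800, 2400),
  value   = c(-0.5, 0.1, 0.4, 1.0, 2.0)
)

# Hypothetical helper: one summary row per segment.
summarize_segment <- function(d) {
  data.frame(
    start    = min(d$start),
    end      = max(d$end),
    length   = max(d$end) - min(d$start) + 1,  # span in base pairs
    size     = nrow(d),                        # number of loci
    mean     = mean(d$value),
    minimum  = min(d$value),
    maximum  = max(d$value),
    variance = var(d$value)                    # assumption: sample variance
  )
}

# Split the loci by segment, summarize each piece, and stack the rows.
seg_summary <- do.call(rbind, lapply(split(loci, loci$segment), summarize_segment))
print(seg_summary)
```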

2.4 Randomization

The same criteria are applied repeatedly to identify and select segments from 10 sets of randomized data, and the summary statistics of the selected segments are compared with those from the original data.

Table 3. Means of summary statistics of segments identified and selected from the original data vs. multiple sets of randomized data: number of loci, segment length, and mean and standard deviation of logFC_Rep0 within segments. If the mean logFC_Rep0 of selected segments can be both positive and negative, absolute values are used in this table.

data size length mean variance
original 149.2045 33254541 0.29 0.78
random_1 78.6228 17433639 0.22 0.24
random_2 79.5758 17671882 0.21 0.24
random_3 73.3520 16272353 0.22 0.24
random_4 82.0625 18218122 0.22 0.25
random_5 78.6228 17426700 0.21 0.26
random_6 72.5414 16112650 0.23 0.24
random_7 76.3372 16888209 0.22 0.24
random_8 89.3197 19842450 0.21 0.25
random_9 75.4598 16744809 0.22 0.25
random_10 59.4118 13125150 0.22 0.24
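The randomization step can be pictured as shuffling the values among loci while keeping the loci themselves in place, then segmenting each shuffled set with the same settings. The sketch below is one possible implementation and not necessarily the one used by the report template. In particular, reading the parameter "chromosome: 1" as "shuffle within each chromosome" is an assumption, and the helper name and toy data are hypothetical.

```r
# Hypothetical helper: shuffle values among loci, optionally within each chromosome.
randomize_values <- function(values, chromosome, within_chromosome = TRUE) {
  if (!within_chromosome) {
    return(values[sample.int(length(values))])
  }
  # ave() applies the function to each chromosome's values and puts the
  # results back in the original positions of that chromosome's loci.
  ave(values, chromosome, FUN = function(v) v[sample.int(length(v))])
}

# Toy data (invented): 6 loci on chr1 and 4 loci on chr2.
set.seed(42)
chromosome <- rep(c("chr1", "chr2"), times = c(6, 4))
values     <- c(seq(0.1, 0.6, by = 0.1), seq(-0.4, -0.1, by = 0.1))

# 10 rounds, matching "round: 10" in the parameters.
rounds     <- 10
randomized <- replicate(rounds, randomize_values(values, chromosome),
                        simplify = FALSE)
print(randomized[[1]])
```

Each element of randomized would then be segmented and summarized in the same way as the original data to fill one row of Table 3.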

Figure 3. Relationship between segment size and segment mean logFC_Rep0. Each dot represents a segment derived from the original real data (blue) and randomized data (grey).

Figure 4. Distribution of segment size compared between original and randomized data.

Figure 5. Distribution of segment length compared between original and randomized data.

Figure 6. Distribution of logFC_Rep0 mean of segments compared between original and randomized data.

Figure 7. Distribution of logFC_Rep0 standard deviation of segments compared between original and randomized data.

4 Appendix

Check out the RoCA home page for more information.

4.1 Reproduce this report

To reproduce this report:

  1. Find the data analysis template you want to use and an example of its paired YAML file here, and download the YAML example to your working directory.

  2. To generate a new report using your own input data and parameters, edit the following items in the YAML file:

    • output : where you want to put the output files
    • home : the URL if you have a home page for your project
    • analyst : your name
    • description : background information about your project, analysis, etc.
    • input : the location of your input data; read the instructions for preparing them
    • parameter : parameters for this analysis; read the instructions on how to prepare input data
  3. Run the code below within R Console or RStudio, preferably in a new R session:

if (!require(devtools)) { install.packages('devtools'); require(devtools); }
if (!require(RCurl)) { install.packages('RCurl'); require(RCurl); }
if (!require(RoCA)) { install_github('zhezhangsh/RoCAR'); require(RoCA); }

CreateReport(filename.yaml);  # filename.yaml is the YAML file you just downloaded and edited

If the code runs without errors, go to the output folder and open the index.html file to view the report.
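The edits in step 2 can also be scripted instead of made by hand. The sketch below uses the yaml package to overwrite the top-level items named in step 2. It assumes these items are top-level keys of the YAML file. The file names and all values are placeholders, and the input and parameter items are left untouched because their structure depends on the template.

```r
# Hypothetical values for some of the items listed in step 2; all are placeholders.
settings <- list(
  output      = "path/to/output",
  home        = "https://example.org/my-project",
  analyst     = "Your Name",
  description = "Background information about the project and analysis"
)

yaml_in  <- "template_example.yaml"  # hypothetical name of the downloaded example
yaml_out <- "my_analysis.yaml"       # hypothetical name of the edited copy

if (requireNamespace("yaml", quietly = TRUE) && file.exists(yaml_in)) {
  config <- yaml::read_yaml(yaml_in)
  config[names(settings)] <- settings  # overwrite only the edited items
  yaml::write_yaml(config, yaml_out)
} else {
  message("yaml package or template file not found; nothing written")
}
```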

4.2 Session information

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] xlsx_0.6.1            vioplot_0.3.0         zoo_1.8-4            
##  [4] sm_2.2-5.6            fastseg_1.28.0        Biobase_2.42.0       
##  [7] GenomicRanges_1.34.0  GenomeInfoDb_1.18.1   IRanges_2.16.0       
## [10] S4Vectors_0.20.1      BiocGenerics_0.28.0   DEGandMore_0.0.0.9000
## [13] snow_0.4-3            htmlwidgets_1.5.1     DT_0.15              
## [16] kableExtra_0.9.0      awsomics_0.0.0.9000   yaml_2.2.1           
## [19] rmarkdown_1.10        knitr_1.20            RoCA_0.0.0.9000      
## [22] RCurl_1.95-4.11       bitops_1.0-6          devtools_2.3.1       
## [25] usethis_1.6.1        
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.2             pkgload_1.0.2          jsonlite_1.6.1        
##  [4] viridisLite_0.3.0      assertthat_0.2.1       highr_0.7             
##  [7] xlsxjars_0.6.1         GenomeInfoDbData_1.2.0 remotes_2.2.0         
## [10] sessioninfo_1.1.1      pillar_1.4.6           backports_1.1.5       
## [13] lattice_0.20-38        glue_1.3.2             digest_0.6.25         
## [16] XVector_0.22.0         rvest_0.3.2            colorspace_1.4-1      
## [19] htmltools_0.4.0        pkgconfig_2.0.3        zlibbioc_1.28.0       
## [22] scales_1.1.1           processx_3.4.2         tibble_2.1.3          
## [25] ellipsis_0.3.0         withr_2.2.0            cli_2.0.2             
## [28] magrittr_1.5           crayon_1.3.4           memoise_1.1.0         
## [31] evaluate_0.14          ps_1.3.2               fs_1.3.2              
## [34] fansi_0.4.1            xml2_1.2.0             pkgbuild_1.1.0        
## [37] tools_3.5.1            prettyunits_1.1.1      hms_0.4.2             
## [40] lifecycle_0.2.0        stringr_1.3.1          munsell_0.5.0         
## [43] callr_3.4.3            compiler_3.5.1         rlang_0.4.5           
## [46] grid_3.5.1             rstudioapi_0.11        crosstalk_1.1.0.1     
## [49] tcltk_3.5.1            testthat_2.3.2         R6_2.4.1              
## [52] rprojroot_1.3-2        readr_1.3.1            desc_1.2.0            
## [55] rJava_0.9-11           stringi_1.2.4          Rcpp_1.0.5
